[SPARK-31390][SQL][DOCS] Document Window Function in SQL Syntax Section #28220
huaxingao wants to merge 9 commits into apache:master
Test build #121296 has finished for PR 28220 at commit
cc @maropu
also cc: @viirya
docs/sql-ref-syntax-qry-window.md
Outdated
**This page is under construction**
### Description

Similarly to aggregate functions, window functions operate on a group of rows. However, unlike aggregate functions, window functions perform aggregation without reducing, calculating a return value for each row in the group. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative, or accessing the value of rows given the relative position of the current row. Spark SQL supports three types of window functions:
There was a problem hiding this comment.
How about changing "Similarly to aggregate functions, window functions operate on a group of rows." to "A window function operates on a group of rows and this is comparable to aggregate functions."?
There was a problem hiding this comment.
"window functions perform aggregation without reducing, calculating a return value for each row in the group." is not clear. Does this mean that window functions do not compute a single aggregated value and, instead, can generate multiple aggregated values for each group?
Test build #121331 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
**This page is under construction**
### Description

Similarly to aggregate functions, window functions operate on a group of rows. However, unlike aggregate functions, window functions perform aggregation without reducing, calculating an aggregated value for each row in the specified window. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative, or accessing the value of rows given the relative position of the current row. Spark SQL supports three types of window functions:
There was a problem hiding this comment.
"without reducing"? Sounds confusing. How about "without reducing the number of rows"?
And "but calculating an aggregated value for each row in the specified window."
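To make the "without reducing the number of rows" point concrete, here is a minimal sketch. It assumes SQLite (through Python's built-in `sqlite3`, which needs SQLite 3.25 or later for window functions) as a stand-in so that it runs without a Spark session; the `employees` table and its rows are hypothetical and not taken from the page under review. The same two queries should behave the same way in Spark SQL, but that is an assumption rather than something verified here.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Lisa", "Sales", 10000),
        ("Evan", "Sales", 32000),
        ("Fred", "Engineering", 21000),
        ("Alex", "Sales", 30000),
    ],
)

# Aggregate function with GROUP BY: one output row per department.
grouped = conn.execute(
    "SELECT dept, SUM(salary) FROM employees GROUP BY dept ORDER BY dept"
).fetchall()

# The same aggregate used as a window function: one output row per input row,
# each carrying the total of its own partition.
windowed = conn.execute(
    "SELECT name, dept, SUM(salary) OVER (PARTITION BY dept) AS dept_total "
    "FROM employees ORDER BY name"
).fetchall()

print(grouped)   # 2 rows, one per department
print(windowed)  # 4 rows, one per employee
conn.close()
```

The grouped query collapses four input rows into two, while the windowed query keeps all four and attaches the department total to each.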
Test build #121333 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
### Description

Window functions operate on a group of rows, referred to as a window, and calculate an aggregated value for each row based on the specified window. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative, or accessing the value of rows given the relative position of the current row. Spark SQL supports three types of window functions:
There was a problem hiding this comment.
Also cc @srowen
Please feel free to rephrase. Thanks!
There was a problem hiding this comment.
computing a cumulative -> computing a cumulative sum (or anything similar: average, statistic)
There was a problem hiding this comment.
Thanks, it looks better. How about putting the last statement on a new line?
...the current row.
Spark SQL supports three types of window functions:
* Ranking Functions
* Analytic Functions
* Aggregate Functions
There was a problem hiding this comment.
Do we need this list here? The Syntax section has the same list.
Test build #121337 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
Specifies a comma separated list of key and value pairs for partitions.<br><br>
<b>Syntax:</b><br>
<code>
{ PARTITION | DISTRIBUTE } BY partition_col_name = partition_col_val ( [ , ... ] )
There was a problem hiding this comment.
nit: there are double spaces in this line.
docs/sql-ref-syntax-qry-window.md
Outdated
MAX | MIN | COUNT | SUM | AVG | ...
</code>
<br>
Please refer <a href="api/sql/">here</a> for a complete list of Spark Aggregate Functions.
There was a problem hiding this comment.
nit: here -> the Built-in Function document?
There was a problem hiding this comment.
nit: Spark Aggregate Functions. -> Spark aggregate functions.?
There was a problem hiding this comment.
I will put sql-ref-functions-builtin.html as the link for the Built-in Function document. It's broken now but will work after your PR is in.
There was a problem hiding this comment.
Uh, could you revert the link? I'm currently not sure that my PR is targeted at 3.0.
Test build #121342 has finished for PR 28220 at commit
Test build #121344 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
<dt><code><em>window_function</em></code></dt>
<dd>
<ul>
<li> Ranking Functions </li>
There was a problem hiding this comment.
nit: <li> Ranking Functions </li> -> <li>Ranking Functions</li>
docs/sql-ref-syntax-qry-window.md
Outdated
MAX | MIN | COUNT | SUM | AVG | ...
</code>
<br>
Please refer to the <a href="sql-ref-functions-builtin.html">Built-in Function</a> document for a complete list of Spark aggregate functions.
There was a problem hiding this comment.
nit: Built-in Function -> Built-in Functions by referring to the title in the doc: https://spark.apache.org/docs/latest/api/sql/index.html
docs/sql-ref-syntax-qry-window.md
Outdated
Specifies an ordering of the rows.<br><br>
<b>Syntax:</b><br>
<code>
{ ORDER | SORT } BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] }
There was a problem hiding this comment.
Could you move the ORDER BY and PARTITION BY clauses into the Syntax section, like the PostgreSQL doc does?
[ existing_window_name ]
[ PARTITION BY expression [, ...] ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ frame_clause ]
docs/sql-ref-syntax-qry-window.md
Outdated
**This page is under construction**
### Description

Window functions operate on a group of rows, referred to as a window, and calculate an aggregated value for each row based on the specified window. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative statistic, or accessing the value of rows given the relative position of the current row.
There was a problem hiding this comment.
"... calculate a return value for each row based on a group of rows"
Test build #121352 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
UNBOUNDED { PRECEDING | FOLLOWING }
| CURRENT ROW
| boolean_expression { PRECEDING | FOLLOWING }
</code> <br><br>
There was a problem hiding this comment.
I think we need to describe what these clauses (RANGE, ROWS, BETWEEN, ...) are.
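As a starting point for that description, here is a small sketch of how `ROWS` and `RANGE` frames differ when the ordering column has ties. It assumes SQLite via Python's `sqlite3` (3.25 or later) as a runnable stand-in for Spark SQL; the table `t` and its values are hypothetical. With `ROWS`, the frame ends at the current physical row; with `RANGE`, it also includes every peer row that has the same ordering value.

```python
import sqlite3

# Hypothetical data: two rows tie on the ordering column k (k = 2).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k INTEGER, v INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 20), (2, 20), (3, 40)]
)

frames = conn.execute(
    """
    SELECT k,
           SUM(v) OVER (ORDER BY k
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum,
           SUM(v) OVER (ORDER BY k
                        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum
    FROM t
    ORDER BY k, rows_sum
    """
).fetchall()

for row in frames:
    print(row)
conn.close()
```

For the two tied rows, `rows_sum` takes two different values (the running total grows one row at a time), while `range_sum` is the same for both because each row's frame already includes its peer.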
Test build #121371 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
### Examples

{% highlight sql %}
Could you update the screenshot in the description, too?
Test build #121393 has finished for PR 28220 at commit
@maropu I have addressed the last two comments and updated the screenshots in description. Thanks for reviewing!
cc @srowen for final sign off.
docs/sql-ref-syntax-qry-window.md
Outdated
+-----+-----------+------+-----+

SELECT name, salary,
LAG(salary) OVER (PARTITION BY dept ORDER BY salary) as lag,
There was a problem hiding this comment.
as -> AS
but definitely don't change it just for that. Looks fine. I'll merge shortly
docs/sql-ref-syntax-qry-window.md
Outdated
+-----+-----------+------+----------+

SELECT name, dept, age, CUME_DIST() OVER (PARTITION BY dept ORDER BY age
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as cume_dist FROM employees;
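For readers who want to try the `LAG` and `CUME_DIST` patterns from the hunks above without a Spark session, the following is a rough sketch only. It assumes SQLite via Python's `sqlite3` (3.25 or later) as a stand-in engine, and the `employees` rows are hypothetical, not the data from the documentation page. It orders by `salary` rather than `age` and omits the explicit frame clause.

```python
import sqlite3

# Hypothetical table and data, invented for this illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Lisa", "Sales", 10000),
        ("Alex", "Sales", 30000),
        ("Evan", "Sales", 32000),
        ("Fred", "Engineering", 21000),
        ("Tom", "Engineering", 23000),
    ],
)

result = conn.execute(
    """
    SELECT name, dept, salary,
           -- Salary of the previous row in the same department (NULL for the first row).
           LAG(salary) OVER (PARTITION BY dept ORDER BY salary) AS prev_salary,
           -- Fraction of rows in the department with salary <= the current row's salary.
           CUME_DIST() OVER (PARTITION BY dept ORDER BY salary) AS cume_dist
    FROM employees
    ORDER BY dept, salary
    """
).fetchall()

for row in result:
    print(row)
conn.close()
```

The first row of each department has no predecessor, so `prev_salary` is `None` there, and the highest-paid row in each department has a cumulative distribution of 1.0.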
Test build #121433 has finished for PR 28220 at commit
### What changes were proposed in this pull request?
Document Window Function in SQL syntax

### Why are the changes needed?
Make SQL Reference complete

### Does this PR introduce any user-facing change?
Yes

<img width="1050" alt="Screen Shot 2020-04-16 at 9 13 34 PM" src="https://user-images.githubusercontent.com/13592258/79531509-7bf5af00-8027-11ea-8291-a91b2e97a1b5.png">
<img width="1050" alt="Screen Shot 2020-04-16 at 9 14 12 PM" src="https://user-images.githubusercontent.com/13592258/79531514-7e580900-8027-11ea-8761-4c5a888c476f.png">
<img width="1050" alt="Screen Shot 2020-04-16 at 9 14 45 PM" src="https://user-images.githubusercontent.com/13592258/79531518-82842680-8027-11ea-876f-6375aa5b5ead.png">
<img width="1050" alt="Screen Shot 2020-04-16 at 9 15 10 PM" src="https://user-images.githubusercontent.com/13592258/79531521-844dea00-8027-11ea-8948-712f054d42ee.png">
<img width="1050" alt="Screen Shot 2020-04-16 at 9 15 25 PM" src="https://user-images.githubusercontent.com/13592258/79531528-8748da80-8027-11ea-9dae-a465286982ac.png">

### How was this patch tested?
Manually build and check

Closes #28220 from huaxingao/sql-win-fun.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(cherry picked from commit 142f436)
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
Thanks! Merged to master/3.0.
Thanks, all!